Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try building rustc with N codegen units #81214

Closed
wants to merge 1 commit into from
Closed

Conversation

tmiasko
Copy link
Contributor

@tmiasko tmiasko commented Jan 20, 2021

r? @ghost

@tmiasko
Copy link
Contributor Author

tmiasko commented Jan 20, 2021

PR CI x86_64-gnu-llvm-9 took 77m (usually it takes ~40m).

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@bors
Copy link
Contributor

bors commented Jan 20, 2021

⌛ Trying commit ed6643a390a37d29cb0e8cf323bf3c46cf3a8c9e with merge ad1e2edcc154dde638611f902f9e504f5a8d9c6f...

@bors
Copy link
Contributor

bors commented Jan 20, 2021

☀️ Try build successful - checks-actions
Build commit: ad1e2edcc154dde638611f902f9e504f5a8d9c6f (ad1e2edcc154dde638611f902f9e504f5a8d9c6f)

@rust-timer
Copy link
Collaborator

Queued ad1e2edcc154dde638611f902f9e504f5a8d9c6f with parent a4cbb44, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 20, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (ad1e2edcc154dde638611f902f9e504f5a8d9c6f): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jan 20, 2021
@tmiasko
Copy link
Contributor Author

tmiasko commented Jan 23, 2021

The size of unpacked toolchain was also reduced from 404M to 354M (as installed by rustup-toolchain-install-master with default options).

Now with two codegen units (PR CI x86_64-gnu-llvm-9 took 63m).

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@bors
Copy link
Contributor

bors commented Jan 23, 2021

⌛ Trying commit 4f9f69512c7c3ff4fdfa298c28f2d339015f0b2d with merge 4c4da1d578ee027b619ce11c645e6de512d26927...

@bors
Copy link
Contributor

bors commented Jan 24, 2021

☀️ Try build successful - checks-actions
Build commit: 4c4da1d578ee027b619ce11c645e6de512d26927 (4c4da1d578ee027b619ce11c645e6de512d26927)

@rust-timer
Copy link
Collaborator

Queued 4c4da1d578ee027b619ce11c645e6de512d26927 with parent 4d0dd02, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 24, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (4c4da1d578ee027b619ce11c645e6de512d26927): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 24, 2021
@bjorn3
Copy link
Member

bjorn3 commented Jan 24, 2021

Huge improvements of up to 15.5%.

@tmiasko
Copy link
Contributor Author

tmiasko commented Jan 24, 2021

With three codegen units (PR CI x86_64-gnu-llvm-9 took 44m).

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@bors
Copy link
Contributor

bors commented Jan 24, 2021

⌛ Trying commit 7848a41345dcccb14b68adaee9526ebd2117c6be with merge 01e13b863b2b389489961bf7431b7a315264a43e...

@bors
Copy link
Contributor

bors commented Jan 24, 2021

☀️ Try build successful - checks-actions
Build commit: 01e13b863b2b389489961bf7431b7a315264a43e (01e13b863b2b389489961bf7431b7a315264a43e)

@rust-timer
Copy link
Collaborator

Queued 01e13b863b2b389489961bf7431b7a315264a43e with parent 85e355e, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jan 24, 2021
@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (01e13b863b2b389489961bf7431b7a315264a43e): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021
@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 11, 2021

With rust.codegen-units=4 (PR CI x86_64-gnu-llvm-9 took 44 min):

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021
@bors
Copy link
Contributor

bors commented Feb 11, 2021

⌛ Trying commit 935f7819e5068b1d43cd9dbe904eb626d2961987 with merge bb215a40be948264811fbac2da595208a74e625c...

@bors
Copy link
Contributor

bors commented Feb 11, 2021

☀️ Try build successful - checks-actions
Build commit: bb215a40be948264811fbac2da595208a74e625c (bb215a40be948264811fbac2da595208a74e625c)

@rust-timer
Copy link
Collaborator

Queued bb215a40be948264811fbac2da595208a74e625c with parent 2918062, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (bb215a40be948264811fbac2da595208a74e625c): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021
@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 11, 2021

With rust.codegen-units=5 (PR CI x86_64-gnu-llvm-9 took 42 min):

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 11, 2021
@bors
Copy link
Contributor

bors commented Feb 11, 2021

⌛ Trying commit f8b67ff with merge 03192e87b7e4444897b09bec2729882ed234cbdd...

@bors
Copy link
Contributor

bors commented Feb 11, 2021

☀️ Try build successful - checks-actions
Build commit: 03192e87b7e4444897b09bec2729882ed234cbdd (03192e87b7e4444897b09bec2729882ed234cbdd)

@rust-timer
Copy link
Collaborator

Queued 03192e87b7e4444897b09bec2729882ed234cbdd with parent e9920ef, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking try commit (03192e87b7e4444897b09bec2729882ed234cbdd): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 12, 2021
@Mark-Simulacrum Mark-Simulacrum removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 12, 2021
@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 14, 2021

The table updated with new measurements is in #81214 (comment)

@Mark-Simulacrum what would be next steps? Is that sufficient to make a decision, or would you like me to repeat try builds once again?

@bjorn3
Copy link
Member

bjorn3 commented Feb 14, 2021

I am curious what the effect of only building libstd with a single codegen unit would be.

@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 14, 2021

The standard library is already built with a single codegen unit.

@Mark-Simulacrum
Copy link
Member

I might be misunderstanding the table, but by my reading it looks like switching to bootstrapping with codegen units affecting only rustc has actually introduced some pretty major regressions - maybe something has changed on master in the meantime?

I guess I am not feeling like we have a complete picture of the constraints here and the effects of various switches. Bootstrap columns in the table seem pretty useless to me -- they're not indicative of a difference between variant 1 and 2 in terms of performance of rustc at runtime.

There are several inputs which can be changed:

  • number of codegen units used for rustc
  • number of codegen units used for std
  • number of codegen units used for tooling like cargo etc (in theory this has no effect, but really we should validate that assumption)

From each of these we have several metrics to evaluate the effect:

  • time spent on CI producing these artifacts
  • size of produced artifacts on disk
  • runtime performance of the artifacts, likely ideally represented as "how close we get to optimal, with these toggles"
    • likely this wants to be roughly "average change" and "worst regression", but ultimately just needs to be a link to perf.rlo

I think we don't want to alter the settings when perf.rlo is run, as that skews our data and generally doesn't make for something that's easy to compare. That means any changes in configuration in this PR should be gated on Rust's CI (somehow) rather than just any x.py run.

Right now the table I think I'm struggling to read the table in a conclusive way - I don't know exactly what would help, but it seems like the primary problem with the current table is that it combines data along these axes in a way that at least for me is hard to compare. Maybe splitting apart into multiple tables or putting each metric in its own table would be helpful, I'm not sure.

@tmiasko
Copy link
Contributor Author

tmiasko commented Feb 14, 2021

it looks like switching to bootstrapping with codegen units affecting only rustc has actually introduced some pretty major regressions - maybe something has changed on master in the meantime?

As far as I can see that is mostly a reflection to what degree optimizer decision made when building rustc affect the results, and how inaccurate benchmarks are as a consequence. After spending some time profiling I am entirely unsurprised given that: 50% of all instructions in inflate are in just one function, for keccak in 2 functions, for match-stress-enum in 2 functions, for ctfe-stress-4 around 10. Just like in match-stress-enum case it should be easy to isolate and direct optimizer towards making the same decisions it is making now, if one desired to do so.

Given that reducing the number of CGUs for tooling takes around 10 min longer in try build ("Try 1" vs "Try 2"), that building with one or two CGUs takes at least 10 min longer ("Try 1" & "Try 2") than other options, and that builds with one, two, three, and four CGUs have regressions in perf results. What about reducing the number of CGUs for rustc to 5, which reduces the size of toolchain by 8 MB, took 54 min in try builder and gave following perf results?

@tmiasko tmiasko closed this Feb 27, 2021
@tmiasko tmiasko deleted the 1 branch February 27, 2021 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants